Abstract: An encoder observes a point pattern (a finite number
of points in the interval $[0, T]$), which is to be described to
a reconstructor using bits. Based on these bits, the reconstructor
wishes to select a subset of $[0, T]$ that contains all the points in
the pattern. It is shown that, if the point pattern is produced
by a homogeneous Poisson process of intensity $\lambda$, and if the
reconstructor is restricted to select a subset of average Lebesgue
measure not exceeding $DT$, then, as $T$ tends to infinity, the
minimum number of bits per second needed by the encoder
is $-\lambda \log D$. It is also shown that, as $T$ tends to infinity, any
point pattern on $[0, T]$ containing no more than $\lambda T$ points can
be successfully described using $-\lambda \log D$ bits per second in this
sense. Finally, a Wyner-Ziv version of this problem is considered
where some of the points in the pattern are known to the
reconstructor.

I. Introduction 

An encoder observes a point pattern — a finite number of 
points in the interval [0, T] — which is to be described to a 
reconstructor using bits. Based on these bits, the reconstructor 
wishes to produce a covering-set — a subset of [0, T] containing 
all the points — of least Lebesgue measure. There is a trade-off 
between the number of bits used and the Lebesgue measure 
of the covering-set. This trade-off can be formulated as a 
continuous-time rate-distortion problem (Section III). In this
paper we investigate this trade-off in the limit where $T \to \infty$.

When the point pattern is produced by a homogeneous 
Poisson process, this problem is closely related to that of 
transmitting information through an ideal peak-limited Poisson 
channel [1], [2], [3], [4]. In fact, the two problems can be
considered dual in the sense of [5]. However, the duality results
of [5] only apply to discrete memoryless channels and sources,
so they cannot be directly used to solve our problem. Instead,
we shall use a technique that is similar to Wyner's [3], [4] to
find the desired rate-distortion function. We shall show that,
if the point pattern is the outcome of a homogeneous Poisson
process of intensity $\lambda$, and if the reconstructor is restricted
to select covering-sets of average measure not exceeding $DT$,
then the minimum number of bits per second needed by the
encoder to describe the pattern is $-\lambda \log D$.

Previous works [6], [7] have studied rate-distortion functions
of the Poisson process with different distortion measures.
It is interesting to notice that our rate-distortion function,
$-\lambda \log D$, is equal to the one in [7], where a queueing
distortion measure is considered. This is no coincidence, since
the Poisson channel is closely related to the queueing channel
introduced in [8].



We also show that the Poisson process is the most difficult 
to cover, in the sense that any point process that, with high 
probability, has no more than $\lambda T$ points in $[0, T]$ can be
described with $-\lambda \log D$ bits per second. This is even true if
an adversary selects the point pattern provided that the pattern
contains no more than $\lambda$ points per second and that the encoder
and the reconstructor are allowed to use random codes. 

Finally, we consider a Wyner-Ziv setting [9] of the problem
where some points in the pattern are known to the reconstructor
but the encoder does not know which ones they are.
This can be viewed as a dual problem to the Poisson channel
with noncausal side-information [10]. We show that in this
setting one can achieve the same minimum rate as when the
encoder does know the reconstructor's side-information.

The rest of this paper is arranged as follows: in Section II we
introduce some notation; in Section III we present the result
for the Poisson process; in Section IV we present the results
for general point processes and arbitrary point patterns; and
in Section V we present the results for the Wyner-Ziv setting.

II. Notation 

We use a lower-case letter like $x$ to denote a number, and an
upper-case letter like $X$ to denote a random variable. We use a
boldface lower-case letter like $\mathbf{x}$ to denote a vector, a function
of reals, or a point pattern, and it will be clear from the context
which one we mean. If $\mathbf{x}$ is a vector, $x_i$ denotes its $i$th element.
If $\mathbf{x}$ is a function, $x(t)$ denotes its value at $t \in \mathbb{R}$. If $\mathbf{x}$ is a
point pattern, we use $n_{\mathbf{x}}(\cdot)$ to denote its counting function, so
$n_{\mathbf{x}}(t_2) - n_{\mathbf{x}}(t_1)$ is the number of points in $\mathbf{x}$ that fall in the
interval $(t_1, t_2]$. We use a boldface upper-case letter like $\mathbf{X}$
to denote a random vector, a random function, or a random
point process. The random counting function corresponding to
a point process $\mathbf{X}$ is denoted by $N_{\mathbf{X}}(\cdot)$.

We use Ber(p) to denote the Bernoulli distribution of 
parameter p, namely, the distribution that has probability p 
on the outcome 1 and probability $1 - p$ on the outcome 0.

III. Covering a Poisson Process 

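Consider a homogeneous Poisson process $\mathbf{X}$ of intensity $\lambda$
on the interval $[0, T]$. Its counting function $N_{\mathbf{X}}(\cdot)$ satisfies

$$\Pr[N_{\mathbf{X}}(t + \tau) - N_{\mathbf{X}}(t) = k] = \frac{e^{-\lambda\tau} (\lambda\tau)^k}{k!}$$

for all $\tau \in [0, T]$, $t \in [0, T - \tau]$ and $k \in \{0, 1, \ldots\}$.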

The encoder maps the realization of the Poisson process
to a message in $\{1, \ldots, 2^{TR}\}$. The reconstructor then maps
this message to a $\{0, 1\}$-valued, Lebesgue-measurable signal
$\hat{x}(t)$, $t \in [0, T]$. We wish to minimize the total length of the
region where $\hat{x}(t) = 1$ while guaranteeing that all points in
the original Poisson process lie in this region. See Figure 1
for an illustration.
Fig. 1. Illustration of the problem.

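More formally, we formulate this problem as a continuous-time
rate-distortion problem, where the distortion between the
point pattern $\mathbf{x}$ and the reproduction signal $\hat{\mathbf{x}}$ is

$$d(\mathbf{x}, \hat{\mathbf{x}}) \triangleq \begin{cases} \dfrac{\mu(\hat{x}^{-1}(1))}{T}, & \text{if all points in } \mathbf{x} \text{ are in } \hat{x}^{-1}(1) \\ \infty, & \text{otherwise} \end{cases} \tag{1}$$

where $\mu(\cdot)$ denotes the Lebesgue measure.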

We say that $(R, D)$ is an achievable rate-distortion pair for
the homogeneous Poisson process of intensity $\lambda$ if, for every
$\epsilon > 0$, there exists some $T_0 > 0$ such that, for every $T > T_0$,
there exist an encoder $f_T(\cdot)$ and a reconstructor $\phi_T(\cdot)$ of
rate $R + \epsilon$ bits per second which, when applied to the Poisson
process $\mathbf{X}$ on $[0, T]$, give

$$E[d(\mathbf{X}, \phi_T(f_T(\mathbf{X})))] \le D + \epsilon.$$

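Denote by $R(D, \lambda)$ the minimum rate $R$ such that $(R, D)$ is
achievable for the homogeneous Poisson process of intensity
$\lambda$. Define

$$R_{\mathrm{Pois}}(D, \lambda) \triangleq \begin{cases} -\lambda \log D \text{ bits per second}, & D \in (0, 1) \\ 0, & D \ge 1. \end{cases} \tag{2}$$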
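As a small numerical illustration of definition (2), the sketch below evaluates the rate for a given distortion and intensity. The function name `r_pois` is hypothetical, and the logarithm is taken to base 2 on the assumption that rates are measured in bits.

```python
import math

def r_pois(D: float, lam: float) -> float:
    """Rate of definition (2) in bits per second (base-2 logarithm assumed)."""
    if D <= 0 or lam <= 0:
        raise ValueError("D and lam must be positive")
    if D >= 1:
        return 0.0  # covering the whole interval needs no description
    return -lam * math.log2(D)

# Example: 2 points per second on average, covering set of relative size 1/4.
print(r_pois(0.25, 2.0))  # 4.0
print(r_pois(0.5, 3.0))   # 3.0
```

Halving the allowed relative measure $D$ costs $\lambda$ additional bits per second under this formula.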



Theorem 1: For all $D, \lambda > 0$,

$$R(D, \lambda) = R_{\mathrm{Pois}}(D, \lambda). \tag{3}$$



To prove Theorem 1 we propose a scheme to reduce the
original problem to one for a discrete memoryless source.
This is reminiscent of Wyner's scheme for reducing the peak-limited
Poisson channel to a discrete memoryless channel [3].
We shall show the optimality of this scheme in Lemma 1 and
we shall then prove Theorem 1 by computing the best rate
that is achievable using this scheme.

Scheme 1: We divide the time-interval $[0, T]$ into slots that are $\Delta$
seconds long. The encoder first maps the original point pattern
$\mathbf{x}$ to a $\{0, 1\}$-valued vector $\mathbf{x}'$ of length $T/\Delta$ (see footnote 1) in the following
way: if $\mathbf{x}$ has at least one point in the time-slot $((i-1)\Delta, i\Delta]$,
choose $x'_i = 1$; otherwise choose $x'_i = 0$. The encoder then
maps $\mathbf{x}'$ to a message in $\{1, \ldots, 2^{TR}\}$.

1 When $T$ is not divisible by $\Delta$, we consider $\mathbf{x}$ as a pattern on $[0, T']$ where
$T' = \lceil T/\Delta \rceil \Delta$. When we let $\Delta$ tend to zero, the difference between $T$ and
$T'$ also tends to zero. Henceforth we ignore this technicality and assume $T$
is divisible by $\Delta$.

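Based on the encoder's message, the reconstructor produces
a $\{0, 1\}$-valued length-$T/\Delta$ vector $\hat{\mathbf{x}}'$ to meet the distortion
criterion

$$E[d'(\mathbf{X}', \hat{\mathbf{X}}')] \le D,$$

where the distortion measure $d'(\cdot, \cdot)$ is given by

$$\begin{aligned} d'(0, 0) &= 0 \\ d'(0, 1) &= 1 \\ d'(1, 0) &= \infty \\ d'(1, 1) &= 1. \end{aligned}$$

It then maps $\hat{\mathbf{x}}'$ to a continuous-time signal $\hat{\mathbf{x}}$ through

$$\hat{x}(t) = \hat{x}'_{\lceil t/\Delta \rceil}, \quad t \in [0, T].$$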

Scheme 1 reduces the task of designing a code for $\mathbf{X}$ subject
to distortion $d(\cdot, \cdot)$ to the task of designing a code for the
vector $\mathbf{X}'$ subject to the distortion $d'(\cdot, \cdot)$. The way we define
$d'(\cdot, \cdot)$ yields the simple relation

$$d(\mathbf{x}, \hat{\mathbf{x}}) = d'(\mathbf{x}', \hat{\mathbf{x}}'). \tag{4}$$
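Relation (4) can be checked on a toy example. The following is a minimal sketch, assuming that $d'$ on vectors is the average of the per-letter distortions and that $T$ is divisible by $\Delta$; all function names are hypothetical, and the slot length is chosen as a power of two so that floating-point slot indices are exact.

```python
import math

def slot_index(t, delta):
    """1-based index i of the slot ((i-1)*delta, i*delta] containing t (t = 0 goes to slot 1)."""
    return max(1, math.ceil(t / delta))

def discretize(points, T, delta):
    """Encoder side of Scheme 1: x'_i = 1 iff slot i holds at least one point."""
    n = round(T / delta)  # assumes T is divisible by delta
    xp = [0] * n
    for t in points:
        xp[slot_index(t, delta) - 1] = 1
    return xp

def d_letter(a, b):
    """Per-letter distortion d': d'(1,0) is infinite, otherwise it equals b."""
    if a == 1 and b == 0:
        return math.inf
    return float(b)

def d_prime(xp, xhat_p):
    """Average per-letter distortion between two binary vectors (assumed normalization)."""
    return sum(d_letter(a, b) for a, b in zip(xp, xhat_p)) / len(xp)

def d_cont(points, xhat_p, T, delta):
    """Distortion (1) for the piecewise-constant signal xhat(t) = xhat'_{ceil(t/delta)}."""
    if any(xhat_p[slot_index(t, delta) - 1] == 0 for t in points):
        return math.inf
    return sum(xhat_p) * delta / T

T, delta = 4.0, 0.5
points = [0.3, 1.7, 1.9, 3.2]
xp = discretize(points, T, delta)
good = [1, 1, 0, 1, 0, 0, 1, 1]  # covers every occupied slot
bad = [1, 0, 0, 0, 0, 0, 1, 0]   # misses the slot holding 1.7 and 1.9
print(xp)
print(d_cont(points, good, T, delta), d_prime(xp, good))
print(d_cont(points, bad, T, delta), d_prime(xp, bad))
```

In both cases the continuous-time and discrete-time distortions coincide, which is the content of (4) for reconstructions that are constant on each slot.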



When $\mathbf{X}$ is the homogeneous Poisson process of intensity
$\lambda$, the components of $\mathbf{X}'$ are independent and identically
distributed (IID) Ber$(1 - e^{-\lambda\Delta})$. Let $R_\Delta(D, \lambda)$ denote the
rate-distortion function for $\mathbf{X}'$ and $d'(\cdot, \cdot)$. If we combine Scheme 1
with an optimal code for $\mathbf{X}'$ subject to $E[d'(\mathbf{X}', \hat{\mathbf{X}}')] \le D + \epsilon$,
we can achieve any rate that is larger than

$$\frac{R_\Delta(D, \lambda)}{\Delta} \ \text{bits per second}.$$

The next lemma, which is reminiscent of [4, Theorem 2.1],
shows that when we let $\Delta$ tend to zero, there is no loss in
optimality in using Scheme 1.

Lemma 1: For all $D, \lambda > 0$,

$$\lim_{\Delta \downarrow 0} \frac{R_\Delta(D, \lambda)}{\Delta} = R(D, \lambda). \tag{5}$$

Proof: See Appendix. ■
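Proof of Theorem 1: We derive $R(D, \lambda)$ by computing
the right-hand side of (5). To compute $R_\Delta(D, \lambda)$ we apply
Shannon's formula of the rate-distortion function for a discrete
memoryless source [11] (see footnote 2):

$$R_\Delta(D, \lambda) = \min_{P_{\hat{Z}|Z} :\, E[d'(Z, \hat{Z})] \le D} I(Z; \hat{Z}), \tag{6}$$

where $Z$ has the Ber$(1 - e^{-\lambda\Delta})$ distribution of the components of $\mathbf{X}'$.
When $D \in (0, 1)$, the conditional distribution $P^*_{\hat{Z}|Z}$ which
achieves the minimum on the right-hand side of (6) is

$$\begin{aligned} P^*_{\hat{Z}|Z}(1|0) &= D e^{\lambda\Delta} - e^{\lambda\Delta} + 1, \\ P^*_{\hat{Z}|Z}(1|1) &= 1. \end{aligned}$$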



2 Strictly speaking, since our distortion measure is unbounded, we need to 
modify Shannon's proof of this formula in order to use it for our problem. This 
can be done by letting the reconstructor produce the all-one sequence, which 
yields bounded distortion for any source sequence, whenever no codeword 
can be found that is jointly typical with the source sequence. 



Computing the mutual information $I(Z; \hat{Z})$ under this $P^*_{\hat{Z}|Z}$
yields

$$R_\Delta(D, \lambda) = H_b(D) - e^{-\lambda\Delta} H_b(D e^{\lambda\Delta} - e^{\lambda\Delta} + 1), \quad D \in (0, 1), \tag{7}$$

where $H_b(\cdot)$ denotes the binary entropy function.

When $D \ge 1$, it is optimal to choose $\hat{Z} = 1$ (deterministically),
yielding

$$R_\Delta(D, \lambda) = 0, \quad D \ge 1. \tag{8}$$

Combining (5), (7) and (8) and computing the limit as $\Delta$
tends to zero yields (3). ■
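The limit in the last step can be checked numerically. The sketch below evaluates (7) divided by $\Delta$ for shrinking $\Delta$ and compares it with $-\lambda \log D$; the function names are hypothetical, logarithms are taken to base 2 (an assumption consistent with rates in bits), and the parameter values are arbitrary.

```python
import math

def hb(p):
    """Binary entropy in bits, with H_b(0) = H_b(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def r_delta(D, lam, delta):
    """Equation (7) for D in (0, 1) and equation (8) for D >= 1."""
    if D >= 1:
        return 0.0
    a = D * math.exp(lam * delta) - math.exp(lam * delta) + 1  # P*(1|0)
    if a < 0:
        raise ValueError("delta too large: need D >= 1 - exp(-lam * delta)")
    return hb(D) - math.exp(-lam * delta) * hb(a)

D, lam = 0.25, 2.0
for delta in (1e-1, 1e-2, 1e-3, 1e-4):
    # Rate in bits per second achieved by Scheme 1 with slot length delta.
    print(delta, r_delta(D, lam, delta) / delta)
print("limit:", -lam * math.log2(D))
```

The printed ratios approach the limiting value from above as $\Delta$ decreases, in line with (5) and (3).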

IV. Covering General Point Processes and 
Arbitrary Point Patterns 
We next consider a general point process $\mathbf{Y}$. We assume
that there exists some $\lambda$ such that

$$\lim_{t \to \infty} \Pr\left[\frac{N_{\mathbf{Y}}(t)}{t} > \lambda + \delta\right] = 0 \quad \text{for all } \delta > 0. \tag{9}$$

Condition (9) is satisfied, for example, when $\mathbf{Y}$ is an ergodic
process whose expected number of points per second is less
than or equal to $\lambda$.

Since the Poisson process is memoryless, one naturally 
expects it to be the most difficult to describe. This is indeed 
the case, as the next theorem shows. 

Theorem 2: The pair $(R_{\mathrm{Pois}}(D, \lambda), D)$ is achievable on any
point process satisfying (9).

Before proving Theorem 2 we state a stronger result.
Consider a point pattern $\mathbf{z}$ chosen by an adversary on the
interval $[0, T]$ which contains no more than $\lambda T$ points. The
corresponding counting function $n_{\mathbf{z}}(\cdot)$ must then satisfy

$$n_{\mathbf{z}}(T) \le \lambda T. \tag{10}$$

The encoder and the reconstructor are allowed to use random
codes. Namely, they fix a distribution on all (deterministic)
codes of a certain rate on $[0, T]$. According to this distribution,
they randomly pick a code which is not revealed to the
adversary. They then apply it to the point pattern $\mathbf{z}$ chosen by
the adversary. We say that $(R, D)$ is achievable with random
coding against an adversary subject to (10) if, for every $\epsilon > 0$,
there exists some $T_0$ such that, for every $T > T_0$, there exists
a random code on $[0, T]$ of rate $R + \epsilon$ such that the expected
distortion between any $\mathbf{z}$ satisfying (10) and its reconstruction
is smaller than $D + \epsilon$.

Theorem 3: The pair $(R_{\mathrm{Pois}}(D, \lambda), D)$ is achievable with
random coding against an adversary subject to (10).

Proof: First note that when $D \ge 1$, the encoder does not
need to describe the pattern: the reconstructor simply produces
the all-one function, yielding distortion 1 for any $\mathbf{z}$. Hence the
pair $(0, D)$ is achievable with random coding.

Next consider $D \in (0, 1)$. We use Scheme 1 as in Section III
to reduce the original problem to one of random coding for an
arbitrary discrete-time sequence $\mathbf{z}'$. Here $\mathbf{z}'$ is $\{0, 1\}$-valued,
has length $T/\Delta$, and satisfies

$$\sum_{i=1}^{T/\Delta} z'_i \le \lambda T. \tag{11}$$

We shall construct a random code of rate $R\Delta$ (that is, $2^{TR}$ codewords
of length $T/\Delta$) which, when
applied to any $\mathbf{z}'$ satisfying (11), yields

$$E[d'(\mathbf{z}', \hat{\mathbf{Z}}')] \le D + \epsilon,$$

where the random vector $\hat{\mathbf{Z}}'$ is the result of applying the
random encoder and decoder to $\mathbf{z}'$. Combined with Scheme 1
this random code will yield a random code on the continuous-time
point pattern $\mathbf{z}$ that achieves the rate-distortion pair
$(R, D)$.

Our discrete-time random code consists of $2^{TR}$ $\{0, 1\}$-valued,
length-$T/\Delta$ random sequences $\hat{\mathbf{Z}}'_m$, $m \in \{1, \ldots, 2^{TR}\}$.
The first sequence $\hat{\mathbf{Z}}'_1$ is chosen deterministically to be the
all-one sequence. The other $2^{TR} - 1$ sequences are drawn
independently, with each sequence drawn IID Ber$(D)$.

To describe source sequence $\mathbf{z}'$, the encoder looks for a
codeword $\hat{\mathbf{z}}'_m$, $m \in \{2, \ldots, 2^{TR}\}$, such that

$$\hat{z}'_{m,i} = 1 \ \text{whenever} \ z'_i = 1. \tag{12}$$

If it finds one or more such codewords, it sends the index of
the first one; otherwise it sends 1. The reconstructor outputs
the sequence $\hat{\mathbf{z}}'_m$ where $m$ is the message it receives from the
encoder.

We next analyze the expected distortion of this random code
for a fixed $\mathbf{z}'$ satisfying (11). Define

$$\mu \triangleq \frac{1}{T} \sum_{i=1}^{T/\Delta} z'_i$$

and note that by (11) $\mu \le \lambda$. Denote by $\mathcal{E}$ the event that the
encoder cannot find $\hat{\mathbf{z}}'_m$, $m \in \{2, \ldots, 2^{TR}\}$, satisfying (12). If
$\mathcal{E}$ occurs, the encoder sends 1 and the resulting distortion is
equal to 1.

The probability that a randomly drawn codeword $\hat{\mathbf{Z}}'_m$ satisfies (12) is

$$D^{\mu T} \ge D^{\lambda T} = 2^{(\lambda \log D) T}.$$

Because the codewords $\hat{\mathbf{Z}}'_m$, $m \in \{2, \ldots, 2^{TR}\}$, are chosen
independently, if we choose $R > -\lambda \log D$, then $\Pr[\mathcal{E}] \to 0$
as $T \to \infty$. Hence, for large enough $T$, the contribution to
the expected distortion from the event $\mathcal{E}$ can be ignored.

We next analyze the expected distortion conditional on $\mathcal{E}^c$.
The reproduction $\hat{\mathbf{Z}}'$ has the following distribution: at positions
where $\mathbf{z}'$ takes the value 1, $\hat{\mathbf{Z}}'$ must also be 1; at other positions
the elements of $\hat{\mathbf{Z}}'$ have the IID Ber$(D)$ distribution. Thus the
expected value of $\sum_{i=1}^{T/\Delta} \hat{Z}'_i$ is $\mu T + D\left(\frac{T}{\Delta} - \mu T\right)$, and

$$E[d'(\mathbf{z}', \hat{\mathbf{Z}}') \mid \mathcal{E}^c] = D + (1 - D)\mu\Delta.$$

When we let $\Delta$ tend to zero, this value tends to $D$. We have
thus shown that, for small enough $\Delta$, we can achieve the pair
$(R\Delta, D)$ on $\mathbf{z}'$ with random coding whenever $R > -\lambda \log D$,
and therefore we can also achieve $(R, D)$ on the continuous-time
point pattern $\mathbf{z}$ with random coding if $R > -\lambda \log D$.

■ 
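The random code used in this proof can be sketched in a few lines. The following is only a toy illustration with a small block length, not an asymptotic experiment; the function names are hypothetical, `num_codewords` plays the role of $2^{TR}$, and `n` plays the role of $T/\Delta$.

```python
import random

def make_codebook(num_codewords, n, D, rng):
    """Codeword 1 is the all-one sequence; the others are drawn IID Ber(D)."""
    book = [[1] * n]
    for _ in range(num_codewords - 1):
        book.append([1 if rng.random() < D else 0 for _ in range(n)])
    return book

def covers(codeword, zp):
    """Condition (12): the codeword is 1 wherever z' is 1."""
    return all(c == 1 for c, z in zip(codeword, zp) if z == 1)

def encode(zp, book):
    """1-based index of the first covering codeword among 2..M, or 1 if none exists."""
    for m in range(2, len(book) + 1):
        if covers(book[m - 1], zp):
            return m
    return 1

def distortion(codeword):
    """Fraction of slots selected by the reconstruction."""
    return sum(codeword) / len(codeword)

rng = random.Random(0)       # shared randomness of encoder and reconstructor
n, D = 40, 0.5
zp = [0] * n
for i in (3, 17, 29):        # an arbitrary pattern with three occupied slots
    zp[i] = 1
book = make_codebook(64, n, D, rng)
m = encode(zp, book)
print(m, distortion(book[m - 1]))
```

Whatever index is sent, the reconstruction covers every occupied slot, because the fallback codeword with index 1 is the all-one sequence; the analysis above bounds how often that fallback is needed.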

We next use Theorem 3 to prove Theorem 2.

Proof of Theorem 2: It follows from Theorem 3 that, on
any point process satisfying (9), the pair $(R_{\mathrm{Pois}}(D, \lambda + \delta), D)$
is achievable with random coding. Further, since there is no
adversary, the existence of a good random code guarantees the
existence of a good deterministic code. Hence $(R_{\mathrm{Pois}}(D, \lambda + \delta), D)$
is also achievable on this process with deterministic
coding. Theorem 2 now follows when we let $\delta$ tend to zero,
since $R_{\mathrm{Pois}}(D, \cdot)$ is a continuous function. ■

V. Some Points are Known to the Reconstructor 

In this section we consider a Wyner-Ziv setting for our 
problem. We first consider the case where $\mathbf{X}$ is a homogeneous
Poisson process of intensity $\lambda$. (Later we consider an arbitrary
point pattern.) Assume that each point in $\mathbf{X}$ is known to the
reconstructor independently with probability $p$. Also assume
that the encoder does not know which points are known
to the reconstructor. The encoder maps $\mathbf{X}$ to a message in
$\{1, \ldots, 2^{TR}\}$, and the reconstructor produces a Lebesgue-measurable,
$\{0, 1\}$-valued signal $\hat{\mathbf{X}}$ on $[0, T]$ based on this
message and the positions of the points that it knows. The
achievability of a rate-distortion pair is defined in the same
way as in Section III. Denote the smallest rate $R$ for which
$(R, D)$ is achievable by $R_{\mathrm{WZ}}(D, \lambda, p)$.

Obviously, $R_{\mathrm{WZ}}(D, \lambda, p)$ is lower-bounded by the smallest
achievable rate when the encoder does know which points
are known to the reconstructor. The latter rate is given by
$R_{\mathrm{Pois}}(D, (1-p)\lambda)$, where $R_{\mathrm{Pois}}(\cdot, \cdot)$ is given by (2). Indeed,
when the encoder knows which points are known to the
reconstructor, it is optimal for it to describe only the remaining
points, which themselves form a homogeneous Poisson
process of intensity $(1-p)\lambda$. The reconstructor then selects a
set based on this description to cover the points unknown to
it and adds to this set the points it knows. Thus,

$$R_{\mathrm{WZ}}(D, \lambda, p) \ge R_{\mathrm{Pois}}(D, (1-p)\lambda). \tag{13}$$

The next theorem shows that (13) holds with equality.

Theorem 4: Knowing the points at the reconstructor only is
as good as knowing them also at the encoder:

$$R_{\mathrm{WZ}}(D, \lambda, p) = R_{\mathrm{Pois}}(D, (1-p)\lambda). \tag{14}$$

To prove Theorem 4 it remains to show that the pair
$(R_{\mathrm{Pois}}(D, (1-p)\lambda), D)$ is achievable. We shall show this as a
consequence of a stronger result concerning arbitrarily varying
sources.

Consider an arbitrary point pattern $\mathbf{z}$ on $[0, T]$ chosen by an
adversary. The adversary is allowed to put at most $\lambda T$ points
in $\mathbf{z}$. Also, it must reveal all but at most $\nu T$ points to the
reconstructor, without telling the encoder which points it has
revealed. The encoder and the reconstructor are allowed to use
random codes, where the encoder is a random mapping from $\mathbf{z}$
to a message in $\{1, \ldots, 2^{TR}\}$, and where the reconstructor is
a random mapping from this message, together with the point
pattern that it knows, to a $\{0, 1\}$-valued, Lebesgue-measurable
signal $\hat{\mathbf{z}}$. The distortion $d(\mathbf{z}, \hat{\mathbf{z}})$ is defined as in (1).

Theorem 5: Against an adversary who puts at most $\lambda T$
points on $[0, T]$ and reveals all but at most $\nu T$ points to
the reconstructor, the rate-distortion pair $(R_{\mathrm{Pois}}(D, \nu), D)$ is
achievable with random coding.



Proof: The case $D \ge 1$ is trivial, so we shall only
consider the case where $D \in (0, 1)$. The encoder and the
reconstructor first use Scheme 1 as in Section III to reduce
the point pattern $\mathbf{z}$ to a $\{0, 1\}$-valued vector $\mathbf{z}'$ of length $T/\Delta$.
Define

$$\mu \triangleq \frac{1}{T} \sum_{i=1}^{T/\Delta} z'_i$$

and note that, by assumption, $\mu \le \lambda$. If $\mu \le \nu$, then we can
ignore the reconstructor's side-information and use the random
code of Theorem 3. Henceforth we assume $\mu > \nu$.

Denote by $\mathbf{s}$ the point pattern known to the reconstructor and
by $\mathbf{s}'$ the vector obtained from $\mathbf{s}$ through the discretization in
time of Scheme 1. Since there are at most $\nu T$ points that are
unknown to the reconstructor,

$$\sum_{i=1}^{T/\Delta} s'_i \ge (\mu - \nu) T. \tag{15}$$

The encoder conveys the value of $\mu T$ to the reconstructor using
bits. Since $\mu T$ is an integer between 0 and $\lambda T$, the number of
bits per second needed to describe it tends to zero as $T$ tends
to infinity.

Next, the encoder and the reconstructor randomly generate
$2^{T(R + \tilde{R})}$ independent codewords

$$\hat{\mathbf{z}}'_{m,l}, \quad m \in \{1, \ldots, 2^{TR}\}, \quad l \in \{1, \ldots, 2^{T\tilde{R}}\},$$

where each codeword is generated IID Ber$(D)$.

To describe $\mathbf{z}'$, the encoder looks for a codeword $\hat{\mathbf{z}}'_{m,l}$ such
that

$$\hat{z}'_{m,l,i} = 1 \ \text{whenever} \ z'_i = 1. \tag{16}$$

If it finds one or more such codewords, it sends the index $m$
of the first one; otherwise it tells the reconstructor to produce
the all-one sequence.

When the reconstructor receives the index $m$, it looks for
an index $l \in \{1, \ldots, 2^{T\tilde{R}}\}$ such that

$$\hat{z}'_{m,l,i} = 1 \ \text{whenever} \ s'_i = 1. \tag{17}$$

If there is only one such codeword, it outputs it as the
reconstruction; if there is more than one such codeword,
it outputs the all-one sequence.

To analyze the expected distortion for $\mathbf{z}'$ over this random
code, first consider the event that the encoder cannot find
a codeword satisfying (16). Note that the probability that a
randomly generated codeword satisfies (16) is $D^{\mu T}$, so the
probability of this event tends to zero as $T$ tends to infinity
provided that

$$R + \tilde{R} > -\mu \log D. \tag{18}$$

Next consider the event that the reconstructor finds more
than one $l$ satisfying (17). The probability that a randomly
generated codeword satisfies (17) is $D^{\sum_{i=1}^{T/\Delta} s'_i}$. Consequently,
by (15), the probability of this event tends to zero as $T$ tends
to infinity provided

$$\tilde{R} < -(\mu - \nu) \log D. \tag{19}$$

Finally, if the encoder finds a codeword satisfying (16) and
the reconstructor finds only one codeword satisfying (17),
then the two codewords must be the same. Following the
same calculations as in the proof of Theorem 3, the expected
distortion in this case tends to $D$ as $\Delta$ tends to zero.

Combining (18) and (19), we can make the expected distortion
arbitrarily close to $D$ as $T \to \infty$ if

$$R > -\nu \log D.$$

■ 
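The way (18) and (19) combine can be made concrete with a short calculation. The sketch below, with the hypothetical helper `choose_bin_rate` and base-2 logarithms assumed, returns a bin rate $\tilde{R}$ that satisfies both conditions whenever $R > -\nu \log D$ and $\mu > \nu$, and reports that none exists otherwise.

```python
import math

def choose_bin_rate(R, mu, nu, D):
    """Return some R_tilde satisfying (18) and (19), or None if R <= -nu*log2(D).

    Assumes 0 < D < 1 and mu > nu, as in the part of the proof where binning is used.
    """
    L = -math.log2(D)          # positive, since D is in (0, 1)
    lower = mu * L - R         # (18): R_tilde must exceed this value
    upper = (mu - nu) * L      # (19): R_tilde must stay below this value
    if lower >= upper:         # equivalent to R <= nu * L
        return None
    lower = max(lower, 0.0)    # a rate cannot be negative
    return (lower + upper) / 2

D, mu, nu = 0.25, 3.0, 1.0
print(choose_bin_rate(2.5, mu, nu, D))  # 3.75
print(choose_bin_rate(2.0, mu, nu, D))  # None
```

In the first call the message rate 2.5 exceeds $-\nu \log D = 2$, so the open interval between the two bounds is nonempty; in the second it is empty.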

Proof of Theorem 4: The claim follows from (13),
Theorem 5 and the Law of Large Numbers. ■

Appendix 

In this appendix we prove Lemma 1. Given any rate-distortion
code with $2^{TR}$ codewords $\hat{\mathbf{x}}_m$, $m \in \{1, \ldots, 2^{TR}\}$,
that achieves expected distortion $D$, we shall construct a new
code that can be constructed through Scheme 1, that contains
$(2^{TR} + 1)$ codewords, and that achieves an expected distortion
that is arbitrarily close to $D$.

Denote the codewords of our new code by $\hat{\mathbf{w}}_m$, $m \in
\{1, \ldots, 2^{TR} + 1\}$. We choose the last codeword to be the
constant 1. We next describe our choices for the other codewords.
For every $\epsilon > 0$ and every $\hat{\mathbf{x}}_m$, we can approximate the set
$\{t : \hat{x}_m(t) = 1\}$ by a set $A_m$ that is equal to a finite, say $N_m$,
union of open intervals. More specifically,

$$\mu\left(\hat{x}_m^{-1}(1) \,\triangle\, A_m\right) \le 2^{-TR} \epsilon, \tag{20}$$

where $\triangle$ denotes the symmetric difference between two sets
(see, e.g., [12, Chapter 3, Proposition 15]). Define

$$B \triangleq \bigcup_{m=1}^{2^{TR}} \left(\hat{x}_m^{-1}(1) \setminus A_m\right),$$

and note that by (20)

$$\mu(B) \le \epsilon. \tag{21}$$
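For each $A_m$, $m \in \{1, \ldots, 2^{TR}\}$, define

$$\mathcal{T}_m \triangleq \left\{t \in [0, T] : \left((\lceil t/\Delta \rceil - 1)\Delta, \lceil t/\Delta \rceil \Delta\right] \cap A_m \ne \emptyset\right\}.$$

We now construct $\hat{\mathbf{w}}_m$, $m \in \{1, \ldots, 2^{TR}\}$, as

$$\hat{\mathbf{w}}_m = \mathbf{1}_{\mathcal{T}_m},$$

where $\mathbf{1}_{\mathcal{S}}$ denotes the indicator function of the set $\mathcal{S}$. Note
that $A_m \subseteq \mathcal{T}_m = \hat{w}_m^{-1}(1)$. See Figure 2 for an illustration of
this construction. Let

$$N \triangleq \max_{m \in \{1, \ldots, 2^{TR}\}} N_m.$$

It can be seen that

$$\mu\left(\hat{w}_m^{-1}(1)\right) - \mu(A_m) \le 2N\Delta, \quad m \in \{1, \ldots, 2^{TR}\}. \tag{22}$$

Fig. 2. Constructing $\hat{\mathbf{w}}_m$ from $A_m$.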

Our encoder works as follows: if $\mathbf{x}$ contains no point in $B$, it
maps $\mathbf{x}$ to the same message as the given encoder; otherwise
it maps $\mathbf{x}$ to the index $(2^{TR} + 1)$ of the all-one codeword. To
analyze the distortion, first consider the case where $\mathbf{x}$ contains
no point in $B$. In this case, all points in $\mathbf{x}$ must be covered by
the selected codeword $\hat{\mathbf{w}}_m$. By (20) and (22), the difference
$d(\mathbf{x}, \hat{\mathbf{w}}_m) - d(\mathbf{x}, \hat{\mathbf{x}}_m)$, if positive, can be made arbitrarily small
by choosing small $\epsilon$ and $\Delta$. Next consider the case where $\mathbf{x}$
does contain points in $B$. By (21), the probability that this
happens can be made arbitrarily small by choosing $\epsilon$ small,
therefore its contribution to the expected distortion can also be
made arbitrarily small. We conclude that our code $\{\hat{\mathbf{w}}_m\}$ can
achieve a distortion that is arbitrarily close to the distortion
achieved by the original code $\{\hat{\mathbf{x}}_m\}$. This concludes the proof
of Lemma 1.
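The slot expansion that turns $A_m$ into $\mathcal{T}_m$ in this appendix can be illustrated on a small example. This is a minimal sketch under simplifying assumptions: the open intervals are disjoint, $T$ is divisible by $\Delta$, and $\Delta$ is a power of two so that floating-point slot indices are exact. The function name `expand_to_slots` is hypothetical.

```python
import math

def expand_to_slots(intervals, T, delta):
    """Sorted 1-based indices i whose slot ((i-1)*delta, i*delta] meets some open interval (a, b)."""
    n = round(T / delta)
    slots = set()
    for a, b in intervals:
        first = math.floor(a / delta) + 1       # smallest i with i*delta > a
        last = min(math.ceil(b / delta), n)     # largest i with (i-1)*delta < b
        slots.update(range(first, last + 1))
    return sorted(slots)

T, delta = 4.0, 0.5
A = [(0.3, 0.7), (2.0, 2.2)]                    # a union of N = 2 open intervals
slots = expand_to_slots(A, T, delta)
excess = len(slots) * delta - sum(b - a for a, b in A)
print(slots, excess <= 2 * len(A) * delta)
```

Each interval can spill into at most one partially covered slot at each end, which is where the factor $2N\Delta$ in (22) comes from.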